Account for memory usage in SortPreservingMerge (#5885) #6382
Conversation
```rust
MemoryConsumer::new(format!("ExternalSorterMerge[{partition_id}]"))
    .register(&runtime.memory_pool);
// ...
merge_reservation.resize(EXTERNAL_SORTER_MERGE_RESERVATION);
```
I take it as a positive sign that this was required to make the spill tests pass; without it, the merge would exceed the memory limit and fail.
```rust
use tokio::task;
// ...
/// How much memory to reserve for performing in-memory sorts
const EXTERNAL_SORTER_MERGE_RESERVATION: usize = 10 * 1024 * 1024;
```
I'm not a massive fan of this, but it somewhat patches around the issue that once we initiate a merge we can no longer spill.
The problem with this approach is that even 10MB may not be enough to correctly merge the batches prior to spilling, so some queries that today would succeed (though exceed their memory limits) might fail.
It seems to me better approaches (as follow-on PRs) would be:
- Make this a config parameter so users can avoid the error by reserving more memory up front if needed (sketched below)
- Teach SortExec to write more (smaller) spill files if it doesn't have enough memory to merge the in-memory batches
However, given that the behavior on master today is to simply ignore the reservation and exceed the memory limit, this behavior seems better than before.
I suggest we merge this PR as is and file a follow-on ticket for the improved behavior.
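As a rough illustration of the first suggestion, a minimal sketch of making the merge reservation size configurable rather than a hard-coded constant. The `ExternalSorterOptions` struct and `sort_spill_reservation_bytes` field are hypothetical names chosen for this sketch, not the existing DataFusion config API:

```rust
// Sketch only: struct and field names are illustrative assumptions,
// not the real DataFusion configuration API.
pub struct ExternalSorterOptions {
    /// Memory reserved up front so the in-memory batches can still be
    /// merged once the pool is exhausted and a spill is required.
    /// Defaults to the current hard-coded 10 MiB.
    pub sort_spill_reservation_bytes: usize,
}

impl Default for ExternalSorterOptions {
    fn default() -> Self {
        Self {
            sort_spill_reservation_bytes: 10 * 1024 * 1024,
        }
    }
}

// The sorter would then size its merge reservation from the option
// instead of the EXTERNAL_SORTER_MERGE_RESERVATION constant:
//
//     merge_reservation.resize(options.sort_spill_reservation_bytes);
```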
```diff
 fn unregister(&self, consumer: &MemoryConsumer) {
     if consumer.can_spill {
-        self.state.lock().num_spill -= 1;
+        self.state.lock().num_spill.checked_sub(1).unwrap();
```
Drive-by sanity check (the first version of MemoryReservation::split would unregister the same consumer multiple times), and these debug checks are the only reason I noticed 😅
Maybe it would be worth adding some unit tests to MemoryReservation now, given it is growing in sophistication.
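As a starting point, here is a rough sketch of the kind of test that could cover the split/unregister accounting discussed above. The module path, `FairSpillPool`, `with_can_spill`, `split`, and `reserved` calls are assumed from the surrounding discussion and may not match the code exactly:

```rust
#[cfg(test)]
mod reservation_tests {
    use std::sync::Arc;
    // Assumed module path; adjust to wherever the memory pool types live.
    use datafusion_execution::memory_pool::{FairSpillPool, MemoryConsumer, MemoryPool};

    #[test]
    fn split_reservation_unregisters_consumer_once() {
        let pool: Arc<dyn MemoryPool> = Arc::new(FairSpillPool::new(1024 * 1024));
        let mut r1 = MemoryConsumer::new("sorter")
            .with_can_spill(true)
            .register(&pool);
        r1.grow(100);

        // Split off part of the reservation; both halves belong to the
        // same spillable consumer.
        let r2 = r1.split(40);
        assert_eq!(r1.size(), 60);
        assert_eq!(r2.size(), 40);
        assert_eq!(pool.reserved(), 100);

        // Dropping each half should release only its own bytes, and the
        // consumer accounting must stay consistent -- the checked_sub
        // debug check above would panic on a double unregister.
        drop(r2);
        assert_eq!(pool.reserved(), 60);
        drop(r1);
        assert_eq!(pool.reserved(), 0);
    }
}
```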
Force-pushed from 435337f to c6542e0
Force-pushed from c6542e0 to d180c8d
alamb left a comment
Thank you @tustvold -- I reviewed this code carefully and it makes sense to me.
However, when I ran the reproducer from https://github.com/influxdata/influxdb_iox/issues/7783 locally with this DataFusion patch, IOx still exceeded its memory limit significantly. I will post more details there.
While of course there are further improvements that could be made, I think this is a significant improvement.
```rust
self.merge_reservation.free();
// ...
self.in_mem_batches = self
    .in_mem_sort_stream(self.metrics.baseline.intermediate())?
```
I double-checked that in_mem_sort_stream correctly respects self.reservation 👍
```rust
use tokio::sync::mpsc::{Receiver, Sender};
use tokio::task;
// ...
/// How much memory to reserve for performing in-memory sorts
```
Suggested change:
```diff
-/// How much memory to reserve for performing in-memory sorts
+/// How much memory to reserve for performing in-memory sorts prior to spill
```
```rust
/// Reservation for in_mem_batches
reservation: MemoryReservation,
partition_id: usize,
/// Reservation for in memory sorting of batches
```
Suggested change:
```diff
-/// Reservation for in memory sorting of batches
+/// Reservation for in memory sorting of batches, prior to spilling.
+/// Without this reservation, when the memory budget is exhausted
+/// it might not be possible to merge the in memory batches as part
+/// of spilling.
```
```rust
rows: Rows,
// ...
#[allow(dead_code)]
```
I think it would help to add a comment here explaining why the code needs to keep a field that is never read (dead_code). I think it is to keep the reservation alive long enough?
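Something along these lines might do. The struct name and surrounding types are assumptions inferred from the diff above, not the actual code:

```rust
// Sketch only: `BatchCursor` and its fields are assumed from the diff above.
use arrow::row::Rows;
use datafusion_execution::memory_pool::MemoryReservation;

struct BatchCursor {
    /// The sorted rows this cursor iterates over
    rows: Rows,
    /// Never read, but holding the reservation here ties the lifetime of the
    /// accounted memory to `rows`: the bytes stay reserved in the pool until
    /// the cursor itself is dropped, at which point `Drop` releases them.
    #[allow(dead_code)]
    reservation: MemoryReservation,
}
```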
We have found another cause of the memory use in IOx downstream, but I still think this PR is valuable. Once we sort out the downstream issue we'll try to get this one polished up and ready to go.

I plan to try and help this PR over the line in the next day or two.

Converted to a draft as this PR is not ready to merge yet.

Ok, I really do plan to pick this code up tomorrow and work on it.

I have a new version of this code in #7130, on which I am making progress.
Which issue does this PR close?
Closes #5885
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?